
Joined Audio-Visual Speech Enhancement and Recognition in the Cocktail Party: The Tug Of War Between Enhancement and Recognition Losses

Pasa, Luca, Morrone, Giovanni, Badino, Leonardo

arXiv.org Machine Learning

In this paper we propose an end-to-end LSTM-based model that performs single-channel speech enhancement and phone recognition in a cocktail party scenario where visual information of the target speaker is available. In the speech enhancement phase the proposed system uses a "visual attention" signal of the speaker of interest to extract her speech from the input mixed-speech signal, while in the ASR phase it recognizes her phone sequence through a phone recognizer trained with a CTC loss. It is well known that learning multiple related tasks from data simultaneously can improve performance compared to learning these tasks independently, so we decided to train the model by optimizing both tasks at the same time. This also allowed us to explore whether (and how) this joint optimization leads to better results. We analyzed different training strategies that reveal some interesting and unexpected behaviors. In particular, the experiments demonstrated that during optimization of the ASR phase the speech enhancement capability of the model significantly decreases, and vice versa. We evaluated our approach on mixed-speech versions of GRID and TCD-TIMIT. The obtained results show a remarkable drop in Phone Error Rate (PER) compared to audio-visual baseline models trained to perform phone recognition only.
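The "tug of war" the abstract describes arises when a single set of parameters is updated against two competing objectives. A minimal sketch of one common way to combine them, a weighted sum of the enhancement loss and the CTC recognition loss, is shown below; the names (`joint_loss`, `alpha`) and the specific weighting scheme are illustrative assumptions, not the paper's exact formulation.

```python
def joint_loss(enhancement_loss: float, ctc_loss: float, alpha: float = 0.5) -> float:
    """Weighted sum of the two task losses.

    alpha trades speech-enhancement quality against phone-recognition
    accuracy: alpha = 1.0 optimizes enhancement only, alpha = 0.0
    optimizes ASR only. Intermediate values expose the "tug of war"
    between the two objectives.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * enhancement_loss + (1.0 - alpha) * ctc_loss


# Example: equal weighting of a hypothetical enhancement loss (0.8)
# and CTC loss (1.4).
print(joint_loss(0.8, 1.4, alpha=0.5))
```

In practice the two loss terms would come from the LSTM enhancement output (e.g. a reconstruction error on the target speaker's spectrogram) and from a CTC layer over the phone vocabulary; alternating which term dominates during training is one of the strategies the paper compares.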


Google Develops AI That Can Separate Voices in a Crowd

#artificialintelligence

Google Research engineers have developed a deep learning system that can separate voices from audio-visual data recorded in crowded environments. The system emulates the "cocktail party" effect, a feature of the human brain that can isolate and focus on one or more particular voices in a crowd. The system is designed to work with both audio and video data at the same time. Google says it created its novel tech by feeding it over 100,000 high-quality videos of lectures and talks hosted on YouTube. All talks were given by a single speaker, with minimal background noise. They trained the AI to recognize sounds based on lip/mouth movement.


[D] Open source pre-trained deep learning model for audio source separation (cocktail party)? • r/MachineLearning

@machinelearnbot

A cocktail party involves multiple sources of speech and non-speech. Even if you can isolate speech from non-speech, how to deal with cross-talk is still a whole other issue.


The Lingo That'll Save Your Next Cocktail Party, From 'Rovables' to 'Manthreading'

WIRED

One of the rewards of inventing something new is that you get to name it. The name doesn't always stick; with familiarity, "horseless carriages" tend to become "automobiles" and finally mere "cars." But the original coinage stands as a wonderful snapshot of how we saw the world at a certain moment, flush with delight in new possibilities. And given a chance to make their mark in the lexicon, even the most sober scientists can be gleefully silly: Think of particle physics' quarks and squarks, its muons and gluons. One of the most poetic neologisms of 2016, included in our year-end round-up of the best new words, was "dark sunshine": hypothetical photons generated by (equally hypothetical) dark matter in stars.


The Three Faces of Bayes

#artificialintelligence

Last summer, I was at a conference having lunch with Hal Daume III when we got to talking about how "Bayesian" can be a funny and ambiguous term. It seems like the definition should be straightforward: "following the work of English mathematician Rev. Thomas Bayes," perhaps, or even "uses Bayes' theorem." But many methods bearing the reverend's name or using his theorem aren't even considered "Bayesian" by his most religious followers. Why is it that Bayesian networks, for example, aren't considered… y'know… Bayesian? As I've read more outside the fields of machine learning and natural language processing -- from psychometrics and environmental biology to hackers who dabble in data science -- I've noticed three broad uses of the term "Bayesian."


AI Contextual Reasoning Learning

#artificialintelligence

Artificial Intelligence (AI) has four seasons: hype, disappointment, funding drought, and renewed interest. I've been involved in AI research for quite some time -- I became a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 1993 -- and I've weathered several seasonal cycles. What I'm seeing now, however, is the most puzzling cycle yet; either I'm getting old and addled, or the current cycle is unique in its magnitude. In these Big Data days, the big talk about AI's potential reminds me of what happened at the peak of earlier cycles (see, for example, the recent Wall Street Journal article). Once again, the focus is on a single technical component -- deep learning -- and hopes seem to be building that it can solve many very hard problems easily and more or less magically.